feat(vortex-bench): wire SpatialBench into the bench orchestrator #8607
HarukiMoriarty wants to merge 1 commit into
Signed-off-by: Nemo Yu <zyu379@wisc.edu>
a0218b2 to fdb0872
Merging this PR will not alter performance
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 168.9 µs | 205.5 µs | -17.85% |
| ❌ | Simulation | bitwise_not_vortex_buffer_mut[128] | 215.3 ns | 244.4 ns | -11.93% |
| ⚡ | Simulation | chunked_bool_canonical_into[(1000, 10)] | 26.9 µs | 16.7 µs | +61.21% |
| ⚡ | Simulation | chunked_varbinview_canonical_into[(100, 100)] | 259.6 µs | 224.5 µs | +15.63% |
| ⚡ | Simulation | chunked_varbinview_into_canonical[(100, 100)] | 306.4 µs | 271.1 µs | +13.01% |
| ⚡ | Simulation | eq_i64_constant | 318.3 µs | 288.6 µs | +10.29% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing nemo/spatial-wire-vx-bench (fdb0872) with nemo/spatial-wkb (c1635cb)
Footnotes
- 4 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.
Summary
Wires SpatialBench into the vx-bench/bench-orchestrator pipeline so it can be run end-to-end like the other benchmarks (datagen → Parquet → Vortex conversion → query). It builds on the WKB datagen landed in #8598.
Running command:
Limitation
DuckDB-only. For now, SpatialBench queries use DuckDB-specific ST_* spatial SQL functions that DataFusion does not provide yet. The restriction is expressed as a single ad-hoc entry, BENCHMARK_ENGINES = { SPATIALBENCH: {DUCKDB} }.
No dictionary encoding / compaction on the WKB column. WKB geometry blobs are large and effectively unique, so running the dictionary builder over them balloons memory (tens of GB) for zero compression gain. The normal compaction path is preserved for every other column on every other benchmark.
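To illustrate why a dictionary buys nothing on a column of effectively unique blobs, here is a minimal sketch of a distinct-ratio check. It is not the logic in this PR (the PR skips the WKB column outright), and `dictionary_worthwhile`, the `max_distinct_ratio` threshold, and the sample columns are all hypothetical names chosen for illustration.

```python
def dictionary_worthwhile(values, max_distinct_ratio=0.5):
    """Return True when enough values repeat for a dictionary to pay off.

    Hypothetical heuristic: a dictionary stores each distinct value once,
    so it only helps when distinct values are a small share of the column.
    """
    if not values:
        return False
    distinct = len(set(values))
    return distinct / len(values) <= max_distinct_ratio


# Hypothetical low-cardinality column: values repeat, a dictionary helps.
city = [b"Madison", b"Chicago", b"Madison", b"Madison"]

# Hypothetical stand-in for WKB blobs: every value is different, so a
# dictionary would hold a full copy of the column plus the codes.
wkb_like = [b"\x01\x01\x00\x00\x00" + i.to_bytes(16, "little") for i in range(4)]

print(dictionary_worthwhile(city))
print(dictionary_worthwhile(wkb_like))
```

Computing the distinct set is itself the expensive part on large blobs, which is consistent with bypassing the check for a column known in advance to be unique.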
Queries 10, 11, and 12 time out because DuckDB's spatial index support is poor.
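The DuckDB-only restriction noted in this section can be pictured as a per-benchmark allow-list that filters the requested engines. The sketch below is an assumption about how such a mapping might be consumed, not the orchestrator's actual code: only the `BENCHMARK_ENGINES` name and its single entry come from this PR, while the `Engine` and `Benchmark` enums, the `TPCH` member, and `supported_engines` are hypothetical.

```python
from enum import Enum


class Engine(Enum):  # hypothetical enum for illustration
    DUCKDB = "duckdb"
    DATAFUSION = "datafusion"


class Benchmark(Enum):  # hypothetical enum for illustration
    SPATIALBENCH = "spatialbench"
    TPCH = "tpch"


# Benchmarks listed here run only on the given engines.
# Benchmarks that are absent are assumed to be unrestricted.
BENCHMARK_ENGINES = {Benchmark.SPATIALBENCH: {Engine.DUCKDB}}


def supported_engines(benchmark, requested):
    """Keep only the requested engines that the benchmark can run on."""
    allowed = BENCHMARK_ENGINES.get(benchmark)
    if allowed is None:
        return list(requested)
    return [engine for engine in requested if engine in allowed]


requested = [Engine.DUCKDB, Engine.DATAFUSION]
print(supported_engines(Benchmark.SPATIALBENCH, requested))
print(supported_engines(Benchmark.TPCH, requested))
```

Keeping the restriction in one mapping means it can be lifted by deleting a single entry once DataFusion gains the needed ST_* functions.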
Performance
SF=1.0
SF=10